Predicting non-coding RNA genes in Escherichia coli with boosted genetic programming
Abstract
Several methods exist for predicting non-coding RNA (ncRNA) genes in Escherichia coli (E. coli). In addition to about sixty known ncRNA genes, excluding tRNAs and rRNAs, various methods have predicted more than a thousand ncRNA genes, but only 95 of these candidates were confirmed by more than one study. Here, we introduce a new method that uses automatic discovery of sequence patterns to predict ncRNA genes. The method predicts 135 novel candidates. In addition, it predicts 152 genes that overlap with predictions in the literature. We test sixteen predictions experimentally and show that twelve of these (75%) are actual ncRNA transcripts. Six of the twelve verified candidates were novel predictions. The relatively high confirmation rate indicates that many of the untested novel predictions are also ncRNAs, and we therefore speculate that E. coli contains more ncRNA genes than previously estimated.
Similar resources
Predicting Non-protein-coding RNA Genes in Escherichia Coli Using SVM with Signature Descriptor
Non-protein-coding RNA (ncRNA) genes are known to play significant roles. Along with transfer RNAs, ribosomal RNAs and mRNAs, ncRNAs contribute to gene splicing, nucleotide modification, protein transport and regulation of gene expression. Several methods exist for predicting ncRNA genes in Escherichia coli (E. coli). In this paper, we describe a very general, high-throughput method for predictin...
Comparison of ertapenem non-susceptibility with 2-mercaptopropionic acid phenotypic tests in predicting NDM-1 and IMP-1 production in clinical isolates of Escherichia coli
Background: A routine phenotypic test has not been recommended for detection of metallo-β-lactamases (MBLs) producing Enterobacteriaceae species such as Escherichia coli. The current study was conducted to compare the 2-Mercaptopropionic acid (2-MPA) phenotypic method and ertapenem non-susceptibility test with polymerase chain reaction in predicting the production of MBLs in clinical isolates o...
Phylogenetic Analysis of Three Long Non-coding RNA Genes: AK082072, AK043754 and AK082467
It is now clear that protein is just one of the functional products of the eukaryotic genome. Indeed, a major part of the human genome is transcribed into non-coding sequences rather than protein-coding sequences. In this study, we selected three long non-coding RNAs, namely AK082072, AK043754 and AK082467, which show brain expression and local region conservation among vertebr...
Predicting Antisense RNAs in the Genomes of Escherichia coli and Salmonella Typhimurium Using Promoter-search Algorithm PlatProm
The pattern recognition software PlatProm, which takes into consideration both sequence-specific and structure-specific features in the genetic environment of promoter sites and identifies transcription start points with very high accuracy, was used to reveal potentially transcribed regions in the genomes of two bacterial species. Along with the expected promoters located upstream from codin...
Transcriptome Sequencing of Guilan Native Cow in Comparison with bosTau4 Reference Genome
RNA-sequencing is a new method for characterizing the transcriptomes of organisms. Based on identity and relatedness, there are large genetic variations among different cattle breeds. The goal of the current study was to sequence the transcriptome of the Guilan native cow and compare it with the available reference genome using RNA-sequencing. Blood samples were collected from 14 Guilan native cows and...
Journal title:
- Nucleic Acids Research
Volume: 33
Publication date: 2005